Results 1 - 20 of 24
1.
J Biomed Inform ; 144: 104444, 2023 08.
Article in English | MEDLINE | ID: mdl-37451494

ABSTRACT

INTRODUCTION: Clinical trials (CTs) often fail due to inadequate patient recruitment. Finding eligible patients involves comparing the patient's information with the CT eligibility criteria. Automated patient matching offers the promise of improving this process, yet the main difficulties of CT retrieval lie in the semantic complexity of matching unstructured patient descriptions with semi-structured, multi-field CT documents and in capturing the meaning of negation in the eligibility criteria. OBJECTIVES: This paper tackles the challenges of CT retrieval by presenting an approach that addresses the patient-to-trials paradigm. Our approach involves two key components in a pipeline-based model: (i) a data enrichment technique for enhancing both queries and documents during the first retrieval stage, and (ii) a novel re-ranking schema that uses a Transformer network in a setup adapted to this task by leveraging the structure of the CT documents. METHODS: We use named entity recognition and negation detection in both the patient description and the eligibility section of CTs. We further classify patient descriptions and CT eligibility criteria into current, past, and family medical conditions. This extracted information is used to boost the importance of disease and drug mentions in both the query and the index for lexical retrieval. Furthermore, we propose a two-step training schema for the Transformer network used to re-rank the results from the lexical retrieval. The first step focuses on matching patient information with the descriptive sections of trials, while the second step aims to determine eligibility by matching patient information with the criteria section. RESULTS: Our findings indicate that the inclusion criteria section of the CT has a great influence on the relevance score in lexical models, and that the enrichment techniques for queries and documents improve the retrieval of relevant trials. The re-ranking strategy, based on our training schema, consistently enhances CT retrieval, improving precision at retrieving eligible trials by 15%. CONCLUSION: The results of our experiments suggest the benefit of making use of extracted entities. Moreover, our proposed re-ranking schema shows promising effectiveness compared to larger neural models, even with limited training data. These findings offer valuable insights for improving methods for the retrieval of clinical documents.
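The query-side enrichment described in the abstract can be sketched as a simple term-weighting step; the function name, boost factor, and token sets below are illustrative assumptions, not the paper's implementation:

```python
def build_boosted_query(tokens, entity_tokens, boost=2.0):
    """Bag-of-words query where NER-flagged disease/drug mentions get extra weight.

    tokens: all tokens from the patient description.
    entity_tokens: the subset flagged as disease or drug mentions.
    """
    weights = {}
    for tok in tokens:
        weights[tok] = weights.get(tok, 0.0) + (boost if tok in entity_tokens else 1.0)
    return weights

# Hypothetical patient description: entity mentions carry twice the weight.
query = build_boosted_query(
    ["female", "diabetes", "metformin", "fatigue"],
    entity_tokens={"diabetes", "metformin"},
)
```

The same boosting can be applied on the index side by repeating or up-weighting entity terms in the document fields before lexical scoring.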


Subject(s)
Information Storage and Retrieval , Semantics , Humans
2.
J Chem Theory Comput ; 18(1): 441-447, 2022 Jan 11.
Article in English | MEDLINE | ID: mdl-34919396

ABSTRACT

Benchmarking DFT functionals is complicated, since the results depend strongly on which properties and materials are used in the process. Unwanted biases can be introduced if a data set contains too many examples of very similar materials. We show that a clustering based on the distribution of the density gradient and the kinetic energy density is able to identify groups of chemically distinct solids. We then propose a method to create smaller data sets, or to rebalance existing ones, such that no region of the meta-GGA descriptor space is overrepresented while the new data set reproduces the average errors of the original set as closely as possible. We apply the method to an existing set of 44 inorganic solids and suggest a representative set of seven solids. The representative sets generated with this method can be used to build more general benchmarks or to train new functionals.
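The subset-selection idea can be sketched as follows. This brute-force version, exhaustive over all k-element subsets and therefore feasible only for small sets, matches only the mean error; it is an illustrative stand-in for the paper's descriptor-space method, with invented material names:

```python
import itertools

def representative_subset(errors, k):
    """Exhaustively pick the k materials whose mean benchmark error is closest
    to the mean error of the full set (feasible only for small sets).

    errors: dict mapping material name -> error of one functional on that material.
    """
    target = sum(errors.values()) / len(errors)
    best, best_gap = None, float("inf")
    for combo in itertools.combinations(errors, k):
        gap = abs(sum(errors[m] for m in combo) / k - target)
        if gap < best_gap:
            best, best_gap = combo, gap
    return best
```

In practice one would additionally require the subset to cover all clusters found in the meta-GGA descriptor space, which is the constraint that keeps any region from being overrepresented.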

3.
Sci Rep ; 11(1): 19241, 2021 09 28.
Article in English | MEDLINE | ID: mdl-34584107

ABSTRACT

Behavioral gender differences have been found for a wide range of human activities, including the way people communicate, move, provision themselves, or organize leisure activities. Using mobile phone data from 1.2 million devices in Austria (15% of the population) across the first phase of the COVID-19 crisis, we quantify gender-specific patterns of communication intensity, mobility, and circadian rhythms. We show the resilience of behavioral patterns with respect to the shock imposed by the strict nation-wide lock-down that Austria experienced at the beginning of the crisis, with severe implications for public and private life. We find drastic differences in gender-specific responses during the different phases of the pandemic. After the lock-down, gender differences in mobility and communication patterns increased massively, while circadian rhythms tended to synchronize. In particular, women had fewer but longer phone calls than men during the lock-down. Mobility declined massively for both genders; however, women tended to restrict their movement more strongly than men. Women showed a stronger tendency to avoid shopping centers, while more men frequented recreational areas. After the lock-down, men returned to normal more quickly than women, and young age cohorts returned much more quickly. The differences are driven by the young and adolescent population. An age stratification highlights the role of retirement in these behavioral differences. We find that the length of the day of both men and women was reduced by 1 h. We interpret and discuss these findings as signals of underlying social, biological and psychological gender differences in coping with crisis and taking risks.


Subject(s)
Behavior/physiology , COVID-19 , Sex Factors , Surveys and Questionnaires , Age Factors , Austria , Cell Phone , Circadian Rhythm , Communication , Female , Humans , Leisure Activities , Male , Pandemics
4.
IEEE Trans Med Imaging ; 40(7): 1934-1949, 2021 07.
Article in English | MEDLINE | ID: mdl-33784615

ABSTRACT

Separating and labeling each nuclear instance (instance-aware segmentation) is the key challenge in nuclear image segmentation. Deep convolutional neural networks have been demonstrated to solve nuclear image segmentation tasks across different imaging modalities, but a systematic comparison on complex immunofluorescence images has not been performed. Deep learning based segmentation requires annotated datasets for training, but annotated fluorescence nuclear image datasets are rare and of limited size and complexity. In this work, we evaluate and compare the segmentation effectiveness of multiple deep learning architectures (U-Net, U-Net ResNet, Cellpose, Mask R-CNN, KG instance segmentation) and two conventional algorithms (iterative h-min based watershed, attributed relational graphs) on complex fluorescence nuclear images of various types. We propose and evaluate a novel strategy to create artificial images to extend the training set. Results show that instance-aware segmentation architectures and Cellpose outperform the U-Net architectures and conventional methods on complex images in terms of F1 scores, while the U-Net architectures achieve overall higher mean Dice scores. Training with additional artificially generated images improves recall and F1 scores for complex images, thereby leading to top F1 scores for three out of five sample preparation types. Mask R-CNN trained on artificial images achieves the overall highest F1 score on complex images acquired under conditions similar to the training set, while Cellpose achieves the overall highest F1 score on complex images from new imaging conditions. We provide quantitative results demonstrating that images annotated by undergraduates are sufficient for training instance-aware segmentation architectures to efficiently segment complex fluorescence nuclear images.
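Instance-level F1, the metric the comparison above hinges on, is commonly computed by matching predicted to ground-truth instances at an IoU threshold. The greedy matcher below is an illustrative sketch of that procedure, not the paper's exact evaluation code:

```python
def instance_f1(pred_masks, gt_masks, iou_thr=0.5):
    """F1 over instances: a prediction is a true positive if it can be greedily
    matched to an unmatched ground-truth instance with IoU >= iou_thr.
    Masks are represented as sets of (row, col) pixel coordinates.
    """
    matched, tp = set(), 0
    for pred in pred_masks:
        for i, gt in enumerate(gt_masks):
            if i in matched:
                continue
            if len(pred & gt) / len(pred | gt) >= iou_thr:
                matched.add(i)
                tp += 1
                break
    fp = len(pred_masks) - tp   # predictions with no matching nucleus
    fn = len(gt_masks) - tp     # nuclei that no prediction covered
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0
```

Unlike pixel-wise Dice, this score penalizes merged or split nuclei, which is why instance-aware architectures can lead on F1 while U-Nets lead on mean Dice.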


Subject(s)
Deep Learning , Algorithms , Fluorescent Antibody Technique , Image Processing, Computer-Assisted , Neural Networks, Computer
5.
BMJ Evid Based Med ; 26(1): 24-27, 2021 02.
Article in English | MEDLINE | ID: mdl-31467247

ABSTRACT

Evidence synthesis is a key element of evidence-based medicine. However, it is currently labour intensive, meaning that many trials are not incorporated into robust evidence syntheses and that many syntheses are out of date. To overcome this, a variety of techniques are being explored, including automation technology. Here, we describe a fully automated evidence synthesis system for intervention studies, one that identifies all the relevant evidence, assesses the evidence for reliability and collates it to estimate the relative effectiveness of an intervention. The techniques used include machine learning, natural language processing and rule-based systems. Results are visualised using modern visualisation techniques. We believe this to be the first publicly available automated evidence synthesis system: an evidence mapping tool that synthesises evidence on the fly.


Subject(s)
Machine Learning , Natural Language Processing , Automation , Humans , Reproducibility of Results
6.
Med Image Anal ; 66: 101796, 2020 12.
Article in English | MEDLINE | ID: mdl-32911207

ABSTRACT

The number of biomedical image analysis challenges organized per year is steadily increasing. These international competitions have the purpose of benchmarking algorithms on common data sets, typically to identify the best method for a given problem. Recent research, however, revealed that common practice related to challenge reporting does not allow for adequate interpretation and reproducibility of results. To address the discrepancy between the impact of challenges and their quality control, the Biomedical Image Analysis ChallengeS (BIAS) initiative developed a set of recommendations for the reporting of challenges. The BIAS statement aims to improve the transparency of the reporting of a biomedical image analysis challenge, regardless of field of application, image modality or task category assessed. This article describes how the BIAS statement was developed and presents a checklist which authors of biomedical image analysis challenges are encouraged to include when submitting a paper on a challenge for review. The purpose of the checklist is to standardize and facilitate the review process and to improve the interpretability and reproducibility of challenge results by making relevant information explicit.


Subject(s)
Biomedical Research , Checklist , Humans , Prognosis , Reproducibility of Results
7.
Sci Data ; 7(1): 262, 2020 08 11.
Article in English | MEDLINE | ID: mdl-32782410

ABSTRACT

Fully automated nuclear image segmentation is the prerequisite for statistically significant, quantitative analyses of tissue preparations, as applied in digital pathology or quantitative microscopy. The design of segmentation methods that work independently of the tissue type or preparation is complex, due to variations in nuclear morphology, staining intensity, cell density and nuclei aggregations. Machine learning-based segmentation methods can overcome these challenges; however, high-quality expert-annotated images are required for training. The few annotated fluorescence image datasets that are currently publicly available do not cover a broad range of tissues and preparations. We present a comprehensive, annotated dataset including tightly aggregated nuclei of multiple tissues for the training of machine learning-based nuclear segmentation algorithms. The proposed dataset covers sample preparation methods frequently used in quantitative immunofluorescence microscopy. We demonstrate the heterogeneity of the dataset with respect to multiple parameters such as magnification, modality, signal-to-noise ratio and diagnosis. Based on a suggested split into training and test sets and on additional single-nuclei expert annotations, machine learning-based image segmentation methods can be trained and evaluated.


Subject(s)
Fluorescence , Image Processing, Computer-Assisted , Machine Learning , Microscopy, Fluorescence , Algorithms , Humans
9.
J Med Internet Res ; 21(1): e10986, 2019 01 30.
Article in English | MEDLINE | ID: mdl-30698536

ABSTRACT

BACKGROUND: Understandability plays a key role in ensuring that people accessing health information are capable of gaining insights that can assist them with their health concerns and choices. Access to unclear or misleading information has been shown to negatively impact the health decisions of the general public. OBJECTIVE: The aim of this study was to investigate methods to estimate the understandability of health Web pages and to use these to improve the retrieval of information for people seeking health advice on the Web. METHODS: Our investigation considered methods to automatically estimate the understandability of health information in Web pages, and it provided a thorough evaluation of these methods using human assessments, as well as an analysis of preprocessing factors affecting understandability estimations and associated pitfalls. Furthermore, lessons learned for estimating Web page understandability were applied to the construction of retrieval methods, with specific attention to retrieving information understandable by the general public. RESULTS: We found that machine learning techniques were more suitable for estimating health Web page understandability than traditional readability formulae, which are often used as guidelines and benchmarks by health information providers on the Web (Pearson correlation of .602 using a gradient boosting regressor, compared with .438 using the Simple Measure of Gobbledygook index, on the Conference and Labs of the Evaluation Forum eHealth 2015 collection). CONCLUSIONS: The findings reported in this paper are important for specialized search services tailored to support the general public in seeking health advice on the Web, as they document and empirically validate state-of-the-art techniques and settings for this domain application.
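The Simple Measure of Gobbledygook (SMOG) baseline mentioned in the abstract can be sketched in a few lines; the syllable counter is a rough vowel-group heuristic assumed for illustration (production implementations use pronunciation dictionaries):

```python
import math
import re

def count_syllables(word):
    """Rough heuristic: count vowel groups (real systems use pronunciation data)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def smog_index(text):
    """Simple Measure of Gobbledygook: estimates the school grade needed to
    understand a text from the density of polysyllabic (3+ syllable) words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291
```

The preprocessing pitfalls the paper analyzes show up directly here: how sentences are split and which tokens count as words can shift the score by several grade levels on HTML-derived text.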


Subject(s)
Information Storage and Retrieval/methods , Internet , Algorithms , Comprehension , Humans
10.
Nat Commun ; 9(1): 5217, 2018 12 06.
Article in English | MEDLINE | ID: mdl-30523263

ABSTRACT

International challenges have become the standard for validation of biomedical image analysis methods. Given their scientific impact, it is surprising that a critical analysis of common practices related to the organization of challenges has not yet been performed. In this paper, we present a comprehensive analysis of biomedical image analysis challenges conducted up to now. We demonstrate the importance of challenges and show that the lack of quality control has critical consequences. First, reproducibility and interpretation of the results are often hampered, as only a fraction of the relevant information is typically provided. Second, the rank of an algorithm is generally not robust to a number of variables, such as the test data used for validation, the ranking scheme applied and the observers who make the reference annotations. To overcome these problems, we recommend best practice guidelines and define open research questions to be addressed in the future.
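The rank-robustness issue can be probed with a simple bootstrap over test cases: resample the cases, recompute the mean score per algorithm, and count how often each algorithm keeps the top rank. The sketch below is illustrative, not the paper's analysis code:

```python
import random

def top_rank_stability(scores, n_boot=500, seed=0):
    """Bootstrap the test cases and report the fraction of resamples in which
    each algorithm achieves the best mean score (higher score = better).

    scores: dict mapping algorithm name -> list of per-case scores.
    """
    rng = random.Random(seed)
    algos = list(scores)
    n = len(next(iter(scores.values())))
    wins = {a: 0 for a in algos}
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        means = {a: sum(scores[a][i] for i in idx) / n for a in algos}
        wins[max(means, key=means.get)] += 1
    return {a: wins[a] / n_boot for a in algos}
```

A leaderboard winner whose stability is far below 1.0 owes its rank partly to the particular test cases chosen, which is exactly the sensitivity the paper documents.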


Subject(s)
Biomedical Technology/methods , Diagnostic Imaging/methods , Image Processing, Computer-Assisted/methods , Technology Assessment, Biomedical/methods , Biomedical Research/methods , Biomedical Research/standards , Biomedical Technology/classification , Biomedical Technology/standards , Diagnostic Imaging/classification , Diagnostic Imaging/standards , Humans , Image Processing, Computer-Assisted/standards , Reproducibility of Results , Surveys and Questionnaires , Technology Assessment, Biomedical/standards
11.
Inf Retr Boston ; 21(6): 565-596, 2018.
Article in English | MEDLINE | ID: mdl-30416369

ABSTRACT

Every information retrieval (IR) model embeds in its scoring function a form of term frequency (TF) quantification. The contribution of the term frequency is determined by the properties of the chosen TF quantification function and by its TF normalization. The first defines how independent the occurrences of multiple terms are, while the second mitigates the a priori probability of having a high term frequency in a document (an estimation usually based on the document length). New test collections from different domains (e.g. medical, legal) give evidence that not only document length but also the verboseness of documents should be explicitly considered. We therefore propose and investigate a systematic combination of document verboseness and length. To theoretically justify the combination, we show the duality between document verboseness and length. In addition, we investigate the duality between verboseness and other components of IR models. We test these new TF normalizations on four suitable test collections, across a well-defined spectrum of TF quantifications. Finally, based on the theoretical and experimental observations, we show how the two components of this new normalization, document verboseness and length, interact with each other. Our experiments demonstrate that the new models never underperform existing models, while sometimes introducing statistically significantly better results, at no additional computational cost.
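One plausible form of such a combined normalization, shown purely as an illustration (the parameter names and the multiplicative combination are assumptions, not necessarily the formula from the paper): pivot the raw TF by both a length factor and a verboseness factor, where verboseness is document length divided by the number of unique terms.

```python
def normalized_tf(tf, doc_len, n_unique, avg_len, avg_verboseness, b=0.5, bv=0.5):
    """Pivoted TF normalization penalizing both long and verbose documents.

    Verboseness is taken as doc_len / n_unique (the document's average term
    frequency); b and bv control the strength of each normalization.
    """
    verboseness = doc_len / n_unique
    length_factor = 1 - b + b * (doc_len / avg_len)
    verbose_factor = 1 - bv + bv * (verboseness / avg_verboseness)
    return tf / (length_factor * verbose_factor)
```

At average length and average verboseness both factors equal 1 and the TF passes through unchanged, mirroring the pivoting behaviour of standard length normalization.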

12.
Stud Health Technol Inform ; 247: 146-150, 2018.
Article in English | MEDLINE | ID: mdl-29677940

ABSTRACT

In this paper, an identification approach for the Population (e.g. patients with headache), the Intervention (e.g. aspirin) and the Comparison (e.g. vitamin C) in Randomized Controlled Trials (RCTs) is proposed. In contrast to previous approaches, the identification is performed at the word level rather than the sentence level. Additionally, we classify the sentiment of RCTs to determine whether an Intervention is more effective than its Comparison. Two new corpora were created to evaluate both approaches. In the experiments, an average F1 score of 0.85 was achieved for the PIC identification and 0.72 for the sentiment classification.
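Word-level PIC identification assigns a label to every token rather than to whole sentences. The rule-based tagger below is a hypothetical illustration of that framing: the cue lists and the "a marker like 'versus' flips Intervention to Comparison" rule are assumptions standing in for the paper's learned model.

```python
def tag_pic(tokens, population_cues, intervention_lexicon, comparison_markers):
    """Toy word-level tagger: label each token P, I, C, or O(ther).

    population_cues / intervention_lexicon / comparison_markers are illustrative
    stand-ins for features a trained sequence labeller would learn.
    """
    labels = []
    after_comparison = False
    for tok in tokens:
        t = tok.lower()
        if t in comparison_markers:          # e.g. "versus" switches I -> C
            after_comparison = True
            labels.append("O")
        elif t in intervention_lexicon:
            labels.append("C" if after_comparison else "I")
        elif t in population_cues:
            labels.append("P")
        else:
            labels.append("O")
    return labels
```

Evaluating such per-token labels against gold annotations is what yields the word-level F1 reported in the abstract.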


Subject(s)
Data Mining , Randomized Controlled Trials as Topic , Research Design , Humans
13.
Stud Health Technol Inform ; 245: 1004-1008, 2017.
Article in English | MEDLINE | ID: mdl-29295252

ABSTRACT

Accessing online health content of high quality and reliability presents challenges. Laypersons cannot easily differentiate trustworthy content from misinformed or manipulated content. This article describes complementary approaches for members of the general public and health professionals to find trustworthy content with as little bias as possible. These include the Khresmoi health search engine (K4E), the Health On the Net Code of Conduct (HONcode) and health trust indicator Web browser extensions.


Subject(s)
Internet , Search Engine , Consumer Health Informatics , Humans , Reproducibility of Results
15.
Genome Med ; 8(1): 71, 2016 06 23.
Article in English | MEDLINE | ID: mdl-27338147

ABSTRACT

Medicine and healthcare are undergoing profound changes. Whole-genome sequencing and high-resolution imaging technologies are key drivers of this rapid and crucial transformation. Technological innovation combined with automation and miniaturization has triggered an explosion in data production that will soon reach exabyte proportions. How are we going to deal with this exponential increase in data production? The potential of "big data" for improving health is enormous but, at the same time, we face a wide range of challenges that must be overcome urgently. Europe is very proud of its cultural diversity; however, exploitation of the data made available through advances in genomic medicine, imaging, and a wide range of mobile health applications or connected devices is hampered by numerous historical, technical, legal, and political barriers. European health systems and databases are diverse and fragmented. There is a lack of harmonization of data formats, processing, analysis, and data transfer, which leads to incompatibilities and lost opportunities. Legal frameworks for data sharing are evolving. Clinicians, researchers, and citizens need improved methods, tools, and training to generate, analyze, and query data effectively. Addressing these barriers will contribute to creating the European Single Market for health, which will improve health and healthcare for all Europeans.


Subject(s)
Biomedical Research/legislation & jurisprudence , Databases, Factual/standards , European Union/organization & administration , Biomedical Research/standards , Databases, Factual/legislation & jurisprudence , Health Plan Implementation , Humans , Information Dissemination/legislation & jurisprudence
16.
IEEE Trans Med Imaging ; 35(11): 2459-2475, 2016 11.
Article in English | MEDLINE | ID: mdl-27305669

ABSTRACT

Variations in the shape and appearance of anatomical structures in medical images are often relevant radiological signs of disease, and automatic tools can help automate parts of the manual process of detecting them. This paper presents a cloud-based evaluation framework, including results of benchmarking current state-of-the-art medical imaging algorithms for anatomical structure segmentation and landmark detection: the VISCERAL Anatomy benchmarks. The algorithms are implemented in virtual machines in the cloud, where participants can access only the training data; the benchmark administrators then run the algorithms privately on an unseen common test set to compare their performance objectively. Overall, 120 computed tomography and magnetic resonance patient volumes were manually annotated to create a standard Gold Corpus containing a total of 1295 structures and 1760 landmarks. Ten participants contributed automatic algorithms for the organ segmentation task, and three for the landmark localization task. Different algorithms obtained the best scores in the four available imaging modalities and for subsets of anatomical structures. The annotation framework, resulting data set, evaluation setup, results and performance analysis from the three VISCERAL Anatomy benchmarks are presented in this article. Both the VISCERAL data set and the Silver Corpus, generated by fusing the participant algorithms' outputs on a larger set of non-manually-annotated medical images, are available to the research community.


Subject(s)
Algorithms , Anatomic Landmarks/diagnostic imaging , Anatomy/methods , Image Processing, Computer-Assisted/methods , Aged , Female , Humans , Magnetic Resonance Imaging , Male , Middle Aged , Tomography, X-Ray Computed
17.
IEEE Trans Pattern Anal Mach Intell ; 37(11): 2153-63, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26440258

ABSTRACT

The Hausdorff distance (HD) between two point sets is a commonly used dissimilarity measure for comparing point sets and image segmentations. Especially when very large point sets are compared using the HD, for example when evaluating magnetic resonance volume segmentations, or when the underlying applications are based on time-critical tasks, like motion detection, the computational complexity of HD algorithms becomes an important issue. In this paper we propose a novel efficient algorithm for computing the exact Hausdorff distance. In a runtime analysis, the proposed algorithm is demonstrated to have nearly-linear complexity. Furthermore, it performs efficiently for large point set sizes as well as for large grid sizes; it performs equally well for sparse and dense point sets; and it is general, without restrictions on the characteristics of the point set. The proposed algorithm is tested against the HD algorithm of the widely used National Library of Medicine Insight Segmentation and Registration Toolkit (ITK) using magnetic resonance volumes of extremely large size. The proposed algorithm outperforms the ITK HD algorithm in both speed and required memory. In an experiment using trajectories from a road network, the proposed algorithm significantly outperforms an HD algorithm based on R-trees.
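A standard way to speed up the exact directed HD is the early-break optimization: the inner scan over the second set stops as soon as it finds a point that cannot raise the running maximum. The sketch below is an illustrative baseline in that spirit, not the paper's proposed algorithm:

```python
import math

def directed_hausdorff(A, B):
    """Directed HD with early break: the scan over B stops as soon as a point
    closer than the current running maximum is found, since the point of A
    being processed can then no longer raise the maximum."""
    cmax = 0.0
    for a in A:
        cmin = math.inf
        for b in B:
            d = math.dist(a, b)
            if d < cmax:        # early break: a cannot contribute to the HD
                cmin = 0.0
                break
            cmin = min(cmin, d)
        cmax = max(cmax, cmin)
    return cmax

def hausdorff(A, B):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))
```

The early break leaves the result exact while skipping most of the inner loop on dense, overlapping sets, which is the regime of segmentation evaluation.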

18.
BMC Med Imaging ; 15: 29, 2015 Aug 12.
Article in English | MEDLINE | ID: mdl-26263899

ABSTRACT

BACKGROUND: Medical image segmentation is an important image processing step. Comparing images to evaluate the quality of segmentation is an essential part of measuring progress in this research area. Some of the challenges in evaluating medical segmentation are: metric selection, the use in the literature of multiple definitions for certain metrics, the inefficiency of metric calculation implementations leading to difficulties with large volumes, and the lack of support for fuzzy segmentation in existing metrics. RESULTS: First, we present an overview of 20 evaluation metrics selected based on a comprehensive literature review. For fuzzy segmentation, which shows the level of membership of each voxel in multiple classes, fuzzy definitions of all metrics are provided. We discuss metric properties to provide a guide for selecting evaluation metrics. Finally, we propose an efficient evaluation tool implementing the 20 selected metrics. The tool is optimized to perform efficiently in terms of speed and required memory, even when the image size is extremely large, as in the case of whole-body MRI or CT volume segmentation. An implementation of this tool is available as an open source project. CONCLUSION: We propose an efficient evaluation tool for 3D medical image segmentation using 20 evaluation metrics and provide guidelines for selecting a subset of these metrics that is suitable for the data and the segmentation task.
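The fuzzy generalization of an overlap metric can be illustrated with the Dice coefficient: replace the crisp intersection with a voxel-wise minimum of membership values. This min/sum form is one common generalization from the literature; the paper's own fuzzy definitions may differ:

```python
def dice(a, b):
    """Crisp Dice coefficient on binary masks (flat lists of 0/1)."""
    intersection = sum(x * y for x, y in zip(a, b))
    return 2 * intersection / (sum(a) + sum(b))

def fuzzy_dice(a, b):
    """Fuzzy Dice: the intersection becomes the voxel-wise minimum of the
    membership values, so crisp 0/1 masks reduce to the classic definition."""
    intersection = sum(min(x, y) for x, y in zip(a, b))
    return 2 * intersection / (sum(a) + sum(b))
```

Because the fuzzy form reduces exactly to the crisp one on binary input, a single implementation can serve both cases, which matters for the memory-efficient whole-volume evaluation the tool targets.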


Subject(s)
Brain Neoplasms/pathology , Image Processing, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Algorithms , Cone-Beam Computed Tomography/methods , Fuzzy Logic , Humans , Magnetic Resonance Imaging/methods
19.
Stud Health Technol Inform ; 205: 358-62, 2014.
Article in English | MEDLINE | ID: mdl-25160206

ABSTRACT

The World Wide Web has become an important source of information for medical practitioners. To complement the capabilities of currently available web search engines we developed FindMeEvidence, an open-source, mobile-friendly medical search engine. In a preliminary evaluation, the quality of results from FindMeEvidence proved to be competitive with those from TRIP Database, an established, closed-source search engine for evidence-based medicine.


Subject(s)
Consumer Health Information/methods , Data Mining/methods , Information Dissemination/methods , Internet , Search Engine/methods , Software , User-Computer Interface , Computers, Handheld , Evidence-Based Medicine , Software Design
20.
IEEE Trans Vis Comput Graph ; 20(12): 1703-12, 2014 Dec.
Article in English | MEDLINE | ID: mdl-26356884

ABSTRACT

Multi-class classifiers often compute scores for the classification samples describing probabilities of belonging to different classes. In order to improve the performance of such classifiers, machine learning experts need to analyze classification results for a large number of labeled samples to find possible reasons for incorrect classification. Confusion matrices are widely used for this purpose. However, they provide no information about the classification scores and features computed for the samples. We propose a set of integrated visual methods for analyzing the performance of probabilistic classifiers. Our methods provide insight into different aspects of the classification results for a large number of samples. One visualization emphasizes at which probabilities these samples were classified and how these probabilities correlate with classification error in terms of false positives and false negatives. Another view emphasizes the features of these samples and ranks them by their power to separate selected true and false classifications. We demonstrate the insight gained using our technique on a benchmark classification dataset, and show how it enables improving classification performance by interactively defining and evaluating post-classification rules.
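The probability-centred view can be approximated numerically by binning samples by their predicted probability and counting correct versus incorrect classifications per bin; the function below is an illustrative sketch (the name and bin count are assumptions), not the paper's visualization code:

```python
def probability_bins(probs, correct, n_bins=5):
    """Count correct vs incorrect classifications per predicted-probability bin,
    the kind of summary the probability-centred view visualizes.

    probs: predicted probability of the assigned class for each sample.
    correct: whether each classification was correct.
    """
    bins = [{"correct": 0, "incorrect": 0} for _ in range(n_bins)]
    for p, ok in zip(probs, correct):
        i = min(int(p * n_bins), n_bins - 1)   # p == 1.0 falls in the last bin
        bins[i]["correct" if ok else "incorrect"] += 1
    return bins
```

A high-probability bin with many incorrect samples flags overconfident errors, the cases for which an analyst would define post-classification rules.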
